Keyword [P-Darts]
Chen X, Xie L, Wu J, et al. Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation[J]. arXiv preprint arXiv:1904.12760, 2019.
1. Overview
1.1. Motivation
1) There is a large gap between the architecture depths in search and evaluation scenarios.
2) Darts searches in a shallow network and evaluates in a deeper one.
3) Darts lacks stability and can be heavily biased towards skip-connect.
The paper proposes P-Darts, which:
1) lets the depth of searched architectures grow gradually during the training procedure.
2) splits the search into multiple stages (3 stages).
3) reduces search time (~7 hours on a single GPU).
1.2. Technique
1.2.1. Search Space Approximation
- as depth increases from stage to stage, the number of candidate operations decreases
- addresses the exponentially increasing computational overhead of searching at greater depth
1.2.2. Search Space Regularization
- applies operation-level dropout to skip-connect and controls the number of skip-connects in the final cell
- addresses the problem of instability
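The operation-level dropout idea can be sketched as follows, including the gradual decay of the dropout rate described in section 2.2. This is a minimal illustration, not the authors' implementation: the function names, the exponential form of the decay, and the `decay` constant are all assumptions, since these notes only state that the rate decays gradually within each search stage.

```python
import math
import random


def decayed_drop_rate(initial_rate, epoch, decay=0.2):
    """Dropout rate applied after skip-connect at a given epoch of one stage.

    ASSUMPTION: exponential decay with a hypothetical constant `decay`;
    the notes only say the rate decays gradually.
    """
    return initial_rate * math.exp(-decay * epoch)


def skip_connect_with_dropout(x, drop_rate, rng, training=True):
    """Identity (skip-connect) followed by inverted dropout on a list of floats."""
    if not training or drop_rate <= 0.0:
        return list(x)
    keep = 1.0 - drop_rate
    # Kept activations are rescaled so the expected value is unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in x]


# Usage: the rate shrinks over the epochs of a stage, so skip-connect is
# penalised early on and treated almost like any other operation later.
rng = random.Random(0)
for epoch in (0, 5, 10):
    rate = decayed_drop_rate(0.4, epoch)
    features = skip_connect_with_dropout([1.0, 2.0, 3.0, 4.0], rate, rng)
    print(epoch, round(rate, 3), features)
```

Dropping the skip-connect path more aggressively at the start of a stage gives the parameterized operations time to learn useful weights before the two kinds of operation are compared on more equal terms.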
1.3. Dataset
1) CIFAR10
2) CIFAR100
2. P-Darts
2.1. Search Space Approximation
- In the final stage, keep the two top-weighted non-zero operations
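The pruning between stages and the final selection can be sketched as below. This is only an illustration under stated assumptions: the helper names are hypothetical, the architecture parameters are made-up numbers, and the candidates are held in a single flat dict (the notes do not say how candidates are grouped when the final two operations are chosen). The operation names are those of the standard Darts search space, with `none` as the zero operation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prune_ops(alpha, keep):
    """Keep the `keep` operations with the largest softmax weight.

    alpha: dict mapping operation name -> architecture parameter (logit).
    Returns a new dict restricted to the surviving operations.
    """
    names = list(alpha)
    weights = softmax([alpha[n] for n in names])
    ranked = sorted(zip(names, weights), key=lambda nw: nw[1], reverse=True)
    return {n: alpha[n] for n, _ in ranked[:keep]}


def final_ops(alpha, keep=2):
    """Final stage: keep the top-weighted operations, ignoring 'none'."""
    non_zero = {n: a for n, a in alpha.items() if n != "none"}
    return list(prune_ops(non_zero, keep))


# Illustrative (made-up) architecture parameters for 8 candidate operations.
alpha = {
    "none": 0.9, "max_pool_3x3": -0.3, "avg_pool_3x3": -0.5,
    "skip_connect": 0.6, "sep_conv_3x3": 0.8, "sep_conv_5x5": 0.2,
    "dil_conv_3x3": 0.1, "dil_conv_5x5": -0.1,
}
stage2 = prune_ops(alpha, 5)   # 8 -> 5 candidates
stage3 = prune_ops(stage2, 3)  # 5 -> 3 candidates
print(sorted(stage3))
print(final_ops(stage3))
```

In an actual search the surviving parameters would be retrained in the deeper network before the next pruning step; the sketch reuses the same numbers only to keep the example short.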
2.2. Search Space Regularization
The authors observe that information prefers to flow through skip-connect rather than through other operations.
1) Insert operation-level dropout after each skip-connect.
Gradually decay the dropout rate during training within each search stage.
2) Restrict the number of skip-connects in the final-stage cell to M.
If the searched cell does not contain M skip-connects, keep the M skip-connects with the largest architecture weights, set the weights of the other skip-connects to 0, and redo cell construction. Repeat until the cell contains M skip-connects.
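The skip-connect control loop in step 2 can be sketched as follows. This is a simplified illustration, not the authors' code: the data layout (a list of nodes, each a list of incoming edges, each a dict of operation weights), the function names, and the example weights are assumptions. Cell construction follows the usual Darts rule of keeping, per node, the two incoming edges whose best non-`none` operation is strongest. The sketch treats M as an upper bound, in line with "M = 2 at most" in the experiment details, and assumes every edge has at least one non-`none` operation other than skip-connect with positive weight.

```python
def build_cell(nodes, k=2):
    """For each node, keep the k incoming edges with the strongest non-'none' op.

    Returns a list of (node_idx, edge_idx, op_name, weight) tuples.
    """
    chosen = []
    for n, edges in enumerate(nodes):
        best = []
        for e, weights in enumerate(edges):
            op, w = max(((o, v) for o, v in weights.items() if o != "none"),
                        key=lambda ov: ov[1])
            best.append((n, e, op, w))
        best.sort(key=lambda t: t[3], reverse=True)
        chosen.extend(best[:k])
    return chosen


def limit_skip_connects(nodes, max_skips):
    """Rebuild the cell until it holds at most `max_skips` skip-connects.

    Mutates `nodes` by zeroing the weights of surplus skip-connects.
    """
    while True:
        cell = build_cell(nodes)
        skips = sorted((c for c in cell if c[2] == "skip_connect"),
                       key=lambda t: t[3], reverse=True)
        if len(skips) <= max_skips:
            return cell
        # Keep the strongest skip-connects, zero the rest, then rebuild.
        for n, e, _, _ in skips[max_skips:]:
            nodes[n][e]["skip_connect"] = 0.0


# Made-up weights: node 0 has two incoming edges, node 1 has three.
nodes = [
    [{"none": 0.05, "skip_connect": 0.80, "sep_conv_3x3": 0.15},
     {"none": 0.05, "skip_connect": 0.75, "sep_conv_3x3": 0.20}],
    [{"none": 0.10, "skip_connect": 0.70, "sep_conv_3x3": 0.20},
     {"none": 0.20, "skip_connect": 0.30, "sep_conv_3x3": 0.50},
     {"none": 0.10, "skip_connect": 0.48, "sep_conv_3x3": 0.42}],
]
cell = limit_skip_connects(nodes, max_skips=2)
for node_idx, edge_idx, op, weight in cell:
    print(node_idx, edge_idx, op, weight)
```

The example shows why the procedure has to repeat: zeroing the skip-connect on one edge of node 1 lets a different edge, whose best operation is also skip-connect, enter the cell on the next rebuild, so a second pass is needed.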
3. Experiments
3.1. Details
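- 3 stages with 5, 11, and 17 cells and 8, 5, and 3 candidate operations, respectively; initial skip-connect dropout rates of 0.0, 0.4, 0.7 on CIFAR10 and 0.1, 0.2, 0.3 on CIFAR100
- M = 2 (the final cell keeps at most 2 skip-connects)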
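For reference, the settings above can be written down as a small configuration structure. The class, field, and function names are hypothetical and chosen only for this sketch; the numbers are the ones listed in these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchStage:
    cells: int                   # network depth during this search stage
    candidate_ops: int           # candidate operations kept in this stage
    initial_skip_dropout: float  # dropout rate on skip-connect at stage start


MAX_SKIP_CONNECTS = 2  # M in the notes

_DROPOUT = {
    "CIFAR10": (0.0, 0.4, 0.7),
    "CIFAR100": (0.1, 0.2, 0.3),
}


def stages_for(dataset):
    """Return the three search stages for the given dataset name."""
    rates = _DROPOUT[dataset]
    return [SearchStage(c, o, d)
            for c, o, d in zip((5, 11, 17), (8, 5, 3), rates)]


for stage in stages_for("CIFAR10"):
    print(stage)
```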